Automatic Summarization for Chinese Text Using Affinity Propagation Clustering and Latent Semantic Analysis

Authors

  • Rui Yang
  • Zhan Bu
  • Zhengyou Xia
Abstract

With the rapid development of the Internet, we can collect more and more information. This also means we need the ability to quickly find, within that mass of information, what is actually useful to us. Automatic summarization helps in handling the huge amount of text on the Web. This paper proposes a Chinese summarization method based on Affinity Propagation (AP) clustering and latent semantic analysis (LSA). AP is a new clustering algorithm, published by B. J. Frey in Science in 2007, that takes as input measures of similarity between pairs of data points and simultaneously considers all data points as potential exemplars. LSA is a technique in natural language processing, in particular in vectorial semantics, for analyzing relationships between a set of sentences. Experimental results show that our method can produce more comprehensive, higher-quality summaries.
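The abstract's description of AP (pairwise similarities as input, every point a candidate exemplar) can be illustrated with a minimal pure-Python sketch of the standard responsibility/availability message passing. This is not the authors' implementation. The function name `affinity_propagation`, the damping factor, the iteration count, the toy one-dimensional points, and the self-similarity ("preference") of -0.5 are all assumptions chosen for illustration. In a summarization setting the points would presumably be sentences and the similarities would come from their vector representations.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Cluster n points given an n x n similarity matrix S.

    S[i][k] is the similarity of point i to candidate exemplar k;
    the diagonal S[k][k] holds the "preference" of k to be an exemplar.
    Returns (exemplar indices, exemplar index assigned to each point).
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well k suits i compared with i's best alternative.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # Availability: how much support k has gathered from the other points.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its combined self-evidence is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [None] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data (assumed): two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
preference = -0.5  # assumed value; lower preferences yield fewer exemplars
S = [[preference if i == j else -(x - y) ** 2 for j, y in enumerate(points)]
     for i, x in enumerate(points)]
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Similarity here is the negative squared distance, a common choice for numeric data. For sentences, a measure such as cosine similarity between sentence vectors could be substituted without changing the message-passing loop.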

Similar resources

Automatic Chinese Dialogue Text Summarization Based On LSA and Segmentation

Automatic Chinese text summarization for dialogue style is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document, all question paragraphs are identified, and an approach of automatic text segmentation analogous to TextTiling is exploited to improve the precision of correlating question paragraphs and answer pa...

Arabic text summarization based on latent semantic analysis to enhance arabic documents clustering

Arabic document clustering is an important task for obtaining good results with traditional Information Retrieval (IR) systems, especially given the rapid growth in the number of online documents in the Arabic language. Document clustering aims to automatically group similar documents into one cluster using different similarity/distance measures. This task is often affected by the document...

Summarizing Disasters Over Time

We have developed a text summarization system that can generate summaries over time from web crawls on disasters. We show that our method of identifying exemplar sentences for a summary using affinity propagation clustering produces better summaries than clustering based on K-medoids as measured using Rouge on a small set of examples. A key component of our approach is the prediction of salient...

Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis

In this paper, two novel approaches are proposed to extract important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It brings up three new ideas: 1) to employ ranked position to emphasize the significance of sentence position, 2) to reshape word unit to achieve higher accuracy of keyword importance, and 3) to train a score function...

A New Approach to Automatic Summarization by Using Latent Dirichlet Allocation in Conditional Random Field

Xiaofeng Wu, Chengqing Zong (National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing 100190, China). Abstract: In recent years, Latent Dirichlet Allocation (LDA) has been used more and more in document clustering, classification, and segmentation, and someone has used it in ...

Publication date: 2012